Full-gene sequence of the donkey leukocyte vaccine strain of the equine infectious anemia virus and its application

ABSTRACT

This invention discloses the full-length proviral genome DNA sequence of Donkey Leukocyte Attenuated Equine Infectious Anemia Virus Vaccine strain, nucleotide and amino acid sequences of all functional genes, and uses of each functional gene and its expression product. The vaccine strain involved in this invention was derived from an isolate in China.

This is a national stage application under 35 U.S.C. 371 of PCT/CN00/00096, filed on Apr. 21, 2000 designating the U.S., now abandoned.

FIELD OF INVENTION

This invention relates to the full-length genome sequence of the donkey leukocyte attenuated EIAV vaccine strain, as well as the sequences and potential uses of the functional genes and their expression products.

BACKGROUND

Equine Infectious Anemia Virus (EIAV), the pathogen of Equine Infectious Anemia, was one of the first viruses to be discovered. Equine Infectious Anemia, first described in France in 1843, has caused tremendous economic losses in animal husbandry for over a century. Scientists all over the world have been working on techniques to control this disease. Since the 1960s, the Chinese government has provided extraordinary financial support to fund studies on the biological and immunological properties of EIAV. During their investigations, Chinese scientists isolated a virulent strain that was quite different from strains isolated in other countries. Through serial passage in donkey leukocytes for many generations, this virulent strain was successfully attenuated. Animals inoculated with this attenuated virus as a vaccine strain developed persistent and strong immunity against EIAV and were protected against Equine Infectious Anemia. The vaccine has been produced on a large scale since 1976 and used nationwide since 1978. More than 70 million horses, mules, and donkeys have been vaccinated and Equine Infectious Anemia has been successfully controlled in China (Rongxian Shen, et al. Study on immunological methods of Equine Infectious Anemia, China Agricultural Science, vol. 4, 1–15, 1979).

EIAV is classified as a lentivirus of the Retroviridae family. In many aspects, such as its genomic organization, functional proteins and the regulatory mode of gene expression, EIAV shows great similarity to Human Immunodeficiency Virus (HIV), another lentivirus, which is the cause of AIDS (J. M. Coffin, The Structure and Classification of Retroviruses, The Retroviridae, Vol. 1, p 19, edited by Jay A. Levy, Plenum Press). Moreover, there are extensive homologies in the structure and function of reverse transcriptase, proteinase, envelope glycoprotein, and nucleoprotein between EIAV and HIV. Horses infected with EIAV manifest symptoms typical of lentiviral infection, including periodic fever, anemia with a decrease in erythrocytes, and persistent viremia. Thus, EIAV may serve as an important research model in investigating the infection mechanisms and the enzymatic functions of other lentiviruses (R. C. Montelaro et al, Equine Retroviruses, in: vol. 2, p 257).

With the development and widespread application of molecular biology since the 1970s, the genomic organization of EIAV has been intensively studied. Full-length genome sequences of EIAV from the standard (reference) virulent strain isolated in the USA (Wyoming strain), from another isolate in Japan (Goshum strain), and from several cell culture-adapted strains have all been registered in GenBank. None of these, however, is a vaccine strain. Currently, the EIAV donkey leukocyte attenuated vaccine strain that was developed in China remains the only lentivirus vaccine strain in the world. It has been used on a large scale and has been proven to be effective and safe over a long period of application (R. C. Montelaro, et al. in: Vaccine against Retroviruses, Vol. 4, P 605; R. C. Montelaro, et al. Equine Retroviruses, in: Vol. 2, P 257). The genomic sequence and organization of the EIAV vaccine strain have not been previously characterized. The characterization of the EIAV vaccine strain genomic sequence and structure will not only lay the foundation for refining the vaccine and preparing new genetically engineered EIAV vaccines, but also, more importantly, may provide a model for developing other lentivirus vaccines.

Therefore, the objects of the invention are to provide the full-length genomic sequence of the EIAV vaccine strain adapted to donkey leukocytes; the sequences of each functional gene and their putative amino acid sequences; and uses of these genes and their expression products.

SUMMARY OF THE INVENTION

The present invention relates to the full-length genomic sequence of the donkey leukocyte attenuated EIAV vaccine strain, including sequences of the main structural genes (gag, pol, env) and the regulatory genes (LTR, rev, S2, tat). The full-length sequence of this strain is 8258 bp long. The nucleotide sequence is shown in SEQ ID NO:1 (see below). In this full-length sequence, the 5′-terminal LTR sequence extends from position 1 to 325 and the 3′-terminal LTR spans from position 7922 to 8258. The gag gene occupies the sequence between 466 and 1926; the pol gene spans the sequence between 1689 and 5120; and the env gene extends from 5313 to 7904. The regulatory gene tat contains two exons: the first exon lies between positions 365 and 462 while the second spans from 5138 to 5276. Similarly, the regulatory gene rev also contains two exons: the first exon is found from position 5454 to 5546, while the second spans from position 7250 to 7651. The gene S2 lies between positions 5287 and 5493. The sequences of all these functional genes and their putative amino acid sequences are listed in SEQ ID NOs: 2 to 7 and SEQ ID NOs: 8 to 13, respectively. A genomic clone of the full-length gene sequence of the EIAV donkey leukocyte attenuated vaccine strain was originally deposited with the China General Microbiological Culture Collection Center (CGMCC) on Apr. 19, 1999 with an accession number of CGMCC No. 0394. This deposit was transferred to an international deposit under the Budapest Treaty on Apr. 19, 2000. As is known to those skilled in the art, the sequences shown in SEQ ID NOs: 1 to 7 may contain some sequencing errors. If so, the deposited clone is the only data source.
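The coordinates given above can be collected into a small feature map, which makes the lengths and overlaps of the annotated regions easy to check. The following is a minimal sketch, assuming 1-based inclusive positions as the text implies; the dictionary layout and the helper names (`feature_length`, `overlap`, `extract`) are illustrative and are not part of the disclosure.

```python
# Coordinates (1-based, inclusive) are copied from the Summary above.
# The data layout and function names are illustrative assumptions.
GENOME_LENGTH = 8258

FEATURES = {
    "5' LTR": [(1, 325)],
    "tat": [(365, 462), (5138, 5276)],   # two exons
    "gag": [(466, 1926)],
    "pol": [(1689, 5120)],
    "S2": [(5287, 5493)],
    "env": [(5313, 7904)],
    "rev": [(5454, 5546), (7250, 7651)],  # two exons
    "3' LTR": [(7922, 8258)],
}


def feature_length(segments):
    """Total length in bp of a feature given as (start, end) segments."""
    return sum(end - start + 1 for start, end in segments)


def overlap(a, b):
    """Number of positions shared by two (start, end) intervals."""
    start, end = max(a[0], b[0]), min(a[1], b[1])
    return max(0, end - start + 1)


def extract(genome, segments):
    """Join the segments of a feature out of a genome string (e.g. SEQ ID NO:1)."""
    return "".join(genome[start - 1:end] for start, end in segments)


if __name__ == "__main__":
    for name, segments in FEATURES.items():
        assert all(1 <= s <= e <= GENOME_LENGTH for s, e in segments)
        print(f"{name:7s} {feature_length(segments):5d} bp")
    print("gag/pol overlap:", overlap(FEATURES["gag"][0], FEATURES["pol"][0]), "bp")
```

With these coordinates, each coding feature (gag, pol, env, tat, rev, S2) has a total length that is a multiple of three, and gag and pol share positions 1689 to 1926.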

DESCRIPTION OF THE PREFERRED EMBODIMENTS

The objects of the invention have been realized using the following techniques. Because the donkey leukocyte-adapted EIAV strain is in the form of a provirus during its replicative phase in vitro and is integrated into the chromosomes of donkey leukocyte, the chromosomal DNA of donkey leukocyte infected with attenuated EIAV was extracted and used as templates for PCR amplification. Briefly, the preliminary PCR amplifications were first carried out by using different primers that were designed based on the genomic sequences of reference EIAV virulent strains. These PCR products were then sequenced. Specific primers that matched the fragment sequence of attenuated EIAV were synthesized and used for another round of PCR amplification. Finally the different PCR products, representing different regions of attenuated EIAV genome, were respectively cloned and sequenced. The full-length sequence including structural genes (gag, pol and env) and regulatory genes (LTR, rev, S2 and tat) of EIAV vaccine strain was obtained.

The ORFs of the full-length sequence were analyzed with GCG software (Genetics Computer Group, Inc., Wisconsin, USA). The nucleotide and the deduced amino acid sequences of the different structural and regulatory genes were identified in the full-length sequence.

Compared with the sequences of the reference EIAV virulent strains available in GenBank, the nucleotide sequence homologies between each gene of the vaccine strain and the corresponding gene of the reference strains ranged from 73.46% to 90.06%. The env, rev and S2 genes were among those with lower sequence homologies between the vaccine and reference strains. The homologies in nucleotide sequences between env, rev and S2 of the vaccine strain and those of the reference strains were 73.46%, 73.54% and 75.76%, respectively, and the homologies in amino acid sequences were 67.41%, 64.85% and 54.51%, respectively.
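As an illustration of how a homology percentage of this kind can be derived from a pairwise alignment, the sketch below computes percent identity over aligned columns. It is only an assumption about the calculation: the text does not state how the GCG software treated gaps or alignment length, so the figures above may follow a different convention. The function name `percent_identity` and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two pre-aligned sequences of equal length.

    Assumption: columns where both sequences have a gap are ignored, and a
    gap opposite a residue counts as a mismatch. Other tools may differ.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap and y == gap:
            continue  # column carries no information
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy alignment, not taken from the EIAV sequences.
    print(percent_identity("ATGGCA-T", "ATGACAGT"))
```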

In addition, the secondary structures of the structural and regulatory proteins of the EIAV vaccine strain were predicted using the GCG software. A significant difference was shown in the secondary structures of the env and tat proteins between the vaccine and reference strains. The numbers and the locations of α-helices, β-sheets and β-turns in many domains were quite different. The tat protein of the vaccine strain has a hydrophobic group at the C-terminus, a β-sheet in the proximal region that forms a concentrated hydrophilic group, and 4 β-turns at the N-terminus. In contrast, no hydrophobic groups are found at the C-terminus of the tat protein from the reference strains. Instead, there are loose random coils in the proximal regions rather than a β-sheet, although 2 separate and distinct hydrophilic groups can be found. Moreover, many β-turns are present at the N-terminus. These differences between secondary structures may indicate functional differences. Based on these differences, the related genes and proteins of HIV and other lentiviruses can be modified, and the use of these modified strains as possible vaccines can be investigated.

Moreover, sequence analysis also showed that there are 19 potential N-linked glycosylation sites in the env gene of the vaccine strain, whereas 23 potential N-linked glycosylation sites can be found in the same region of the reference strain.
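Potential N-linked glycosylation sites are commonly located by scanning a protein sequence for the sequon N-X-S/T, where X is any residue except proline. The sketch below applies that rule; it assumes this standard definition, since the text does not state which criterion was used, and the function name and toy peptide are hypothetical.

```python
import re

# Sequon N-X-[S/T] with X != P. The lookahead keeps the match zero-width
# after the asparagine so that overlapping sequons are all reported.
_SEQUON = re.compile(r"N(?=[^P][ST])")


def n_glycosylation_sites(protein):
    """Return 1-based positions of asparagines in potential N-X-S/T sequons."""
    return [m.start() + 1 for m in _SEQUON.finditer(protein.upper())]


if __name__ == "__main__":
    toy = "NGSANPTNNTS"  # toy peptide, not an EIAV sequence
    sites = n_glycosylation_sites(toy)
    print(len(sites), sites)
```

Applied to the deduced env amino acid sequence, a count of this kind could be compared between the vaccine and reference strains.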

The present invention is the first report, within China or abroad, to characterize in detail the genomic organization (including structural and regulatory genes and the related proteins) of the full-length genome of the EIAV vaccine strain. This invention will guide attempts to develop vaccines for immunoprophylaxis of chronic diseases caused by other lentiviruses. The nucleotide sequence and the serological diagnostic reagents derived from the genes and proteins of the EIAV vaccine strain can be used in the diagnosis of EIAV infection, the differentiation of wild-type virus infection from inoculation with the EIAV vaccine, and the development of new vaccines through genetic engineering. The gene sequence reported here can also be used for differentiating horses immunized with the EIAV vaccine from those infected with the EIAV American strain, as described in Example 2 below. Finally, the full-length genome of the EIAV vaccine strain can be used to construct gene transfer vectors of the kind generally used in gene therapy. Since EIAV does not cause illness in humans, an EIAV-based transfer vector can be used more widely and safely than other vectors, such as the vector derived from murine leukemia virus (MLV).

BRIEF DESCRIPTION OF DRAWINGS

FIG. 1 shows the PCR amplification results of different templates using primer set I, as described in Example 2. Lanes 1 and 2 indicate the amplification results of total DNA from donkey leukocytes infected with the EIAV vaccine strain (˜200 bp); Lane 3 is the DNA marker (pBR322/BstNI); Lane 4 is amplification results of total DNA from donkey leukocytes infected with the reference strain (Wyoming strain).

FIG. 2 shows the PCR amplification results of different templates using primer set II, as described in Example 2. Lane 1 is the DNA marker (DL2000, TaKaRa); Lanes 2 and 3 are the amplification results of total DNA from donkey leukocytes infected with the EIAV vaccine strain (˜190 bp); Lane 4 indicates the amplification results of total DNA from donkey leukocytes; Lane 5 shows the amplification results of total DNA from donkey leukocytes infected with the reference strain (Wyoming strain).

FIG. 3 shows the PCR amplification results of different templates using primer set III, as described in Example 2. Lane 1 is DNA marker (pBR322/BstNI); Lane 2 shows amplification results of total DNA from donkey leukocytes infected with the reference strain (Wyoming strain). Lanes 3 and 4 are the amplification results of total DNA from donkey leukocytes infected with EIAV vaccine strain (˜380 bp).

FIG. 4 shows the PCR amplification results of different templates using primer set IV, as described in Example 2. Lanes 1 and 2 are the amplification results of total DNA from donkey leukocytes infected with EIAV vaccine strain (˜220 bp); Lane 3 is DNA marker (pBR322/BstNI); Lane 4 is amplification results of total DNA from donkey leukocytes infected with the reference strain (Wyoming strain).

The invention will be illustrated with reference to the drawings and following examples.

EXAMPLE 1

Sequencing of the complete genome of Equine Infectious Anemia Virus (EIAV) donkey leukocyte attenuated vaccine

A. Virus culture and extraction of cellular genomic DNA

The Equine Infectious Anemia Virus (EIAV) donkey leukocyte attenuated vaccine (Harbin Veterinary Research Institute, Harbin, China) was inoculated into a healthy donkey leukocyte culture at 37° C. with 100% bovine serum (1 ml/10⁷ cells). When a cytopathic effect appeared, the supernatant was discarded, and the cellular genomic DNA was extracted using the Qiagen Genomic DNA Kit (Qiagen, USA).

B. Amplifying, cloning and sequencing of different fragments of EIAV full-length genome

The gp90 and 5′ LTR genes of the invention were first amplified using primers designed according to the published sequences of Wyoming strains. Then, primers were designed according to the sequences thus obtained to amplify the other regions of the genome. The PCR products were subcloned into pGEM®-T Easy Vector (Promega) according to the manufacturer's protocol. The restriction endonuclease EcoRI was used to identify recombinants, and an ABI 377 DNA sequencer was used for sequencing. Overlapping sequence fragments were assembled into a complete genome and every gene was identified with the GCG software. The complete genome sequence is listed in SEQ ID NO. 1. Identified genes are presented in SEQ ID NO. 2–7. The details for each fragment are as follows:

(1) Fragment A(FA):

-   Amplified range: 5′LTR(1–320)
-   Primers and method:
-   First cycle (SEQ ID NOS:14 and 15, respectively):
    -   Forward primer: LTR-F1 5′-GCGCGCGAATTCTGTGGGGTTTTTATGAG-3′
    -   Backward primer: GR11 5′-AACCTTGCTGCTATGGGAAT-3′
    -   Reaction system: 10×buffer (Mg²⁺ 2 mM) 5 ul, dNTP (10 mM) 1 ul, LTR-F1 (20 uM) 1 ul, GR11 (20 uM) 1 ul, Taq polymerase (Promega) 1 ul (1 u), template 1 ul (200 ng), H₂O 41 ul.
    -   Reaction program: 95° C. 5 min; then 94° C. 1 min, 52° C. 1 min, 72° C. 1.5 min, for 30 cycles followed by 72° C. 10 min.
-   Second cycle (SEQ ID NOS:14 and 16, respectively):
    -   Forward primer: LTR-F1 5′-GCGCGCGAATTCTGTGGGGTTTTTATGAG-3′
    -   Backward primer: LTR-R1 5′-CCCCCTCTAGATCTAGGATCTGGAACAGAC-3′

The reaction system was the same as in the first cycle except for the primers, and the template was replaced with 5 ul of the product from the first cycle.

    -   Reaction condition: 95° C. 5 min; then 94° C. 1 min, 55° C. 1 min, 72° C. 1 min, for 30 cycles followed by 72° C. 10 min.
    -   Sequencing method: universal primer T7 was used.

(2) Fragment B (FB):

-   Amplified range: 115–1188

-   Primers (SEQ ID NOS:17 and 15, respectively) and method:     -   Forward primer: 732 5′-ACCGCAATAACCGCATTTGTGACG-3′     -   Backward primer: GR11 5′-AACCTTGCTGCTATGGGAAT-3′

The reaction system was the same as in the first amplification cycle of FA (above), except that primers 732 and GR11 were used. The reaction condition was the same as in the first amplification cycle of FA.

-   Sequencing method:

Universal primers T7 and SP6 were used.

(3) Fragment C (FC):

-   Amplified range: 460–5254
-   Primers (SEQ ID NOS:18 and 19, respectively) used in the first cycle:
    -   Forward primer: P13 5′-GTAAGATGGGAGACCCTTTG-3′
    -   Backward primer: EVENR1 5′-ATGCTGACCATGTTACCCCTT-3′
-   Primers (SEQ ID NOS:18 and 20, respectively) used in the second cycle:
    -   Forward primer: P13 5′-GTAAGATGGGAGACCCTTTG-3′
    -   Backward primer: EVENR2 5′-CAGATACTGAGGTTGTCTTCCT-3′
-   Reaction system and condition:

The Expand™ Long Template PCR System (Boehringer Mannheim) was used according to the protocol (with buffer 3). The PCR product was subcloned into a vector at common restriction sites and sequenced with universal primers T7 and SP6 and primer RLR10.

-   Sequencing method:

Initially, universal primers T7 and SP6 were used and the orientation of the inserted gene was determined. Then the primers named RTF1, RTR20, RTR40, and GF40 were designed and synthesized according to the sequence thus obtained, in order to extend the sequencing. By this means, the full-length sequence of the subcloned DNA fragment was obtained.

-   The DNA sequences of the primers (SEQ ID NOS:21 to 27, respectively) are as follows:
    -   T7 5′-TAA TC GAC TCA CTA TAG GGA GA-3′
    -   SP6 5′-CAT ACG ATT TAG GTG ACA CTA TAG-3′
    -   RTF1 5′-CCT CGA GGG AGA TGC ATA TTT C-3′
    -   RTR20 5′-CCC TCA TCT CAT CCA TGT T-3′
    -   RTR40 5′-GGT ATA GTG AAA TAT GCA TCT CC-3′
    -   GF40 5′-AAG TGA GGG ACA TCC GGC T-3′
    -   RLR10 5′-CTG TCC TCC CAT ACT TTC-3′

(4) Fragment D (FD):

-   Amplified range: (5222–6715)
-   Primers (SEQ ID NOS:28 and 29, respectively) and method:
    -   Forward primer: F70 5′-CCAACAAGGAAGGACAACCTC-3′
    -   Backward primer: R51 5′-AGACATAGTAGCGCTAGCAG-3′

The reaction system was the same as the first amplification cycle of FA, except that the primers F70 and R51 were used. The reaction condition was 95° C. 5 min; then 94° C. 1 min, 52° C. 1 min, 72° C. 2 min, for 30 cycles followed by 72° C. 10 min.

-   Sequencing method:

Primers T7 and SP6 were used first for sequencing, and primer LTR20 (SEQ ID NO:30; 5′-GCCTTAATGCAACAGTC-3′) was designed from the resulting sequence. The complete sequence was obtained using LTR20.

(5) Fragment E (FE):

-   Amplified range: (6680–8174)
-   Primers (SEQ ID NOS:31 and 32, respectively):
    -   Forward primer: EF51 5′-GCTACTGCTATTGCTGCTAG-3′
    -   Backward primer: LR3 5′-CTCAGACCGCAGAATCTGAGT-3′
-   Reaction system and condition:

Amplification was performed according to the protocol of the Expand™ Long Template PCR System (Boehringer Mannheim) (Buffer 3 was used).

-   Sequencing method:

Universal primers T7 and SP6 were used.

(6) Fragment F (FF):

-   Amplified range: 3′LTR (7937–8251)
-   Primers (SEQ ID NOS:31 and 16, respectively) used in the first cycle:
    -   Forward primer: EF51 5′-GCTACTGCTATTGCTGCTAG-3′
    -   Backward primer: LTR-R1 5′-CCCCCTCTAGATCTAGGATCTGGAACAGAC-3′
-   Primers (SEQ ID NOS:14 and 16, respectively) used in the second cycle:
    -   Forward primer: LTR-F1 5′-GCGCGCGAATTCTGTGGGGTTTTTATGAG-3′
    -   Backward primer: LTR-R1 5′-CCCCCTCTAGATCTAGGATCTGGAACAGAC-3′

The reaction system of the first cycle was similar to that of the first cycle of Fragment A amplification, except that primers EF51 and LTR-R1 were used. The reaction condition was: 95° C. 5 min; then 94° C. 1 min, 52° C. 1 min, 72° C. 1.5 min, for 30 cycles followed by 72° C. 10 min.

The reaction system of the second cycle was similar to that of the first cycle mentioned above, except that primers LTR-F1 and LTR-R1 were used and 5 ul of the amplified product of the first cycle was used as the template. The reaction condition was: 95° C. 5 min; then 94° C. 1 min, 55° C. 1 min, 72° C. 1 min, for 30 cycles followed by 72° C. 10 min.

-   Sequencing method:

Universal primer T7 was used.
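The six amplified ranges listed in Example 1 are meant to overlap so that they can be assembled into one contiguous genome. The following minimal sketch checks that property for the stated ranges; the tuple layout and the function name `adjacent_overlaps` are illustrative assumptions, and only the coordinates come from the text.

```python
# Amplified ranges (1-based, inclusive) as stated for fragments A to F.
FRAGMENTS = [
    ("FA", 1, 320),
    ("FB", 115, 1188),
    ("FC", 460, 5254),
    ("FD", 5222, 6715),
    ("FE", 6680, 8174),
    ("FF", 7937, 8251),
]


def adjacent_overlaps(fragments):
    """Map each pair of neighbouring fragments to its overlap in bp.

    A value of zero or less would indicate a gap between the two fragments.
    """
    ordered = sorted(fragments, key=lambda f: f[1])
    return {
        (left[0], right[0]): left[2] - right[1] + 1
        for left, right in zip(ordered, ordered[1:])
    }


if __name__ == "__main__":
    for pair, bp in adjacent_overlaps(FRAGMENTS).items():
        print(pair, bp, "bp")
    print("span:", min(f[1] for f in FRAGMENTS), "to", max(f[2] for f in FRAGMENTS))
```

Every neighbouring pair shares at least 33 positions, and together the stated ranges span positions 1 to 8251 of the 8258 bp sequence.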

EXAMPLE 2

Diagnostic methods distinguishing the EIAV attenuated vaccine strain from American epidemic EIAV strains

This example demonstrates the use of the gene sequences of the invention in detecting EIAV pathogen characteristics. It is useful for distinguishing horses immunized with the EIAV attenuated vaccine strain from those infected with the American epidemic EIAV strains.

Using the genomic sequence of the EIAV donkey leukocyte attenuated vaccine strain of this invention and the sequence of the international standard strain Wyoming (GenBank Accession No. AF028232), the following primers (SEQ ID NOS:33 to 39, respectively) specific to the EIAV donkey leukocyte attenuated vaccine strain were designed and synthesized.

Primer name   Position   Sequence (5′-3′)         Direction
-----------   --------   ---------------------   ---------
CHF1369       1369       GGACATCCGGCTGATATAAC     Forward
CHR1567       1567       GACCAGCTAATCCTGCTTGA     Backward
CHR1553       1553       GCTTGAAGTGCCTTGGCTAA     Backward
CHF5129       5129       TCAGGAATCACCACCAGTCAG    Forward
CHR5610       5610       GTTGTTGCCTCTCATACCAC     Backward
CHF1338       1338       AGACAGATTGCTGTCTC        Forward
CHR1572       1572       CATAGGACCAGCTATC         Backward
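For short oligonucleotides such as these, a rough melting temperature can be estimated with the Wallace rule, Tm = 2(A+T) + 4(G+C) in °C, which gives a quick sanity check on a chosen annealing temperature. The sketch below is illustrative only: the text does not say how the primers were designed or evaluated, and the function name `wallace_tm` is hypothetical. The primer sequences are copied from the table above.

```python
PRIMERS = {
    "CHF1369": "GGACATCCGGCTGATATAAC",
    "CHR1567": "GACCAGCTAATCCTGCTTGA",
    "CHR1553": "GCTTGAAGTGCCTTGGCTAA",
    "CHF5129": "TCAGGAATCACCACCAGTCAG",
    "CHR5610": "GTTGTTGCCTCTCATACCAC",
    "CHF1338": "AGACAGATTGCTGTCTC",
    "CHR1572": "CATAGGACCAGCTATC",
}


def wallace_tm(primer):
    """Approximate Tm in deg C by the Wallace rule: 2*(A+T) + 4*(G+C).

    This is a coarse estimate intended for short oligos; it ignores salt
    and primer concentration.
    """
    seq = primer.upper()
    if set(seq) - set("ACGT"):
        raise ValueError("primer must contain only A, C, G, T")
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(f"{name}: {len(seq)} nt, Tm ~ {wallace_tm(seq)} C")
```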

PCR was performed on different templates, including total DNA from donkey leukocytes inoculated with the EIAV attenuated vaccine and from donkey leukocytes inoculated with the international standard strain Wyoming. Taq DNA polymerase was purchased from PROMEGA Corp. The PCR system included 10×buffer 5 ul, 2.5 mM MgCl₂ 1 ul, dNTP (2.5 mM) 1 ul, primer 1 (20 uM) 1 ul, primer 2 (20 uM) 1 ul, Taq polymerase (Promega) 1 ul (2 u), H₂O 34.6 ul, template 2 ul.

-   Primer 1 and primer 2 were combined in the following groups:
    -   Group I: CHF1369+CHR1567 (Product is about 200 bp in length)
    -   Group II: CHF1369+CHR1553 (Product is about 190 bp in length)
    -   Group III: CHF5129+CHR5610 (Product is about 380 bp in length)
    -   Group IV: CHF1338+CHR1572 (Product is about 220 bp in length)

Reaction conditions: 95° C. 5 min; then 94° C. 40 s, 50° C. 30 s, 72° C. 30 s, for 30 cycles followed by 72° C. 10 min.
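The cycling conditions above can be written down as a small data structure, which also gives the nominal hold time of the program. This is a minimal sketch; the `Step` class and `program_seconds` function are hypothetical names, and instrument ramp times between temperatures are not included.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: float   # hold temperature in deg C
    seconds: int    # hold time in seconds


def program_seconds(initial, cycle_steps, n_cycles, final):
    """Nominal hold time of a PCR program, excluding ramping."""
    per_cycle = sum(step.seconds for step in cycle_steps)
    return initial.seconds + n_cycles * per_cycle + final.seconds


# Conditions as stated for Example 2.
EXAMPLE_2 = dict(
    initial=Step(95, 5 * 60),
    cycle_steps=[Step(94, 40), Step(50, 30), Step(72, 30)],
    n_cycles=30,
    final=Step(72, 10 * 60),
)

if __name__ == "__main__":
    total = program_seconds(**EXAMPLE_2)
    print(f"{total} s ({total / 60:.0f} min)")
```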

The results were analyzed by agarose gel electrophoresis. They demonstrate that each primer group yields a specific amplification product from the total DNA of donkey leukocytes inoculated with the EIAV attenuated vaccine, but not from the DNA of donkey leukocytes inoculated with the international standard strain Wyoming under the same conditions. See FIGS. 1–4.

1. An isolated polynucleotide comprising the nucleotide sequence as set forth in SEQ ID NO:1, wherein said polynucleotide encodes the full-length provirus genome of Equine Infectious Anemia Virus Donkey Leukocyte Attenuated Vaccine strain, wherein nucleotide 1 to 325 of SEQ ID NO:1 is the 5′LTR; nucleotide 7922 to 8258 is the 3′LTR; nucleotide 466 to 1926 is the gag gene; nucleotide 1689 to 5120 is the pol gene; nucleotide 5313 to 7904 is the env gene; nucleotide 365 to 462 is the first exon of tat gene; nucleotide 5138 to 5276 is the second exon of tat gene; nucleotide 5454 to 5546 is the first exon of rev gene; nucleotide 7250 to 7651 is the second exon of rev gene; and nucleotide 5287 to 5493 is the S2 gene.
2. A method for the manufacture of a recombinant vector, comprising the step of inserting the polynucleotide of claim 1 in a suitable vector.
3. An isolated polynucleotide comprising the sequence contained in the deposit under CGMCC deposit number 0394.